Refactor/encode decode via lut #40
refactor: decode uses LUT, improve delta computation
feat: provide naive and LUT-based algorithms for both encode and decode
This is really interesting and I think worth pursuing. My understanding is that you've done at least 2 big things to make things faster:
My #42 does only the first part and isn't quite as fast. But it's still a big boost, -70% to -75%, and seemingly without any of the related slowdowns, so it feels not at all controversial. How would you feel about starting by merging #42? Alternatively, you (@mattiZed) could pull the string allocation part out of this PR into a separate PR. I'd be happy to have your name on the commits if you'd prefer that, especially since you wrote it first. 🙃 My motivation is that it might be illuminating to evaluate what exactly we're getting from the LUT without muddying the waters, since seemingly most of the speedup comes from avoiding all the string allocations.
I agree 100%. I will remove the string allocation improvement from this PR so it can be merged first. Then merging your PR afterwards should be straightforward. Will do it tomorrow.
I was thinking let's merge the string allocation improvement first, since it's strictly an improvement.
Either is fine for me, will update this PR now.
… traceability of performance gains
And here are the new timings: Most of the encoding gain was from pushing to the same buffer throughout, and will be recovered by #42.
Would you mind rebasing and running against the new benchmarks now that #42 is merged? I think/hope they'll be a little more stable.
Hi Mike, I'm on vacation this week and next, and will continue work on this afterwards. However, you could also take over if you'd like or if it's urgent in any sense. Best,
Is this LUT implementation something you're still interested in merging, @mattiZed? Since we've merged a subset of the changes in the meantime, the remaining work is resolving some conflicts.
Okay, here we go. This should hopefully finally fix #39 and #37 as well as #35.
I have taken some inspiration from flexpolyline. I think the performance improvement on the encoder side comes mainly from computing the scaled longitude/latitude only once for each new coordinate pair; compare the loop in `encode_coordinates()`. When we subtract `previous` from `current`, `previous` is already scaled.

EDIT: Seeing the performance gains made in #42, I'd rather attribute the improved encoder performance to removing the string concatenation and pushing directly to the result buffer.
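To make the two encoder ideas above concrete (scale each coordinate once, and push straight into one result buffer), here is a minimal sketch. It is not the crate's actual code: the names `encode_delta` and `encode_coordinates_sketch`, the `(lat, lng)` tuple input, and the `i64` intermediate type are all assumptions made for illustration.

```rust
/// Append one signed delta to `out` using the polyline 5-bit chunk scheme.
/// (Hypothetical helper, not the crate's API.)
fn encode_delta(delta: i64, out: &mut String) {
    // Zigzag: shift left by one, invert if the delta was negative.
    let zigzag = if delta < 0 { !(delta << 1) } else { delta << 1 };
    let mut value = zigzag as u64;
    // Emit 5 bits at a time; 0x20 marks "more chunks follow".
    while value >= 0x20 {
        let chunk = ((value & 0x1f) | 0x20) as u8 + 63;
        out.push(chunk as char);
        value >>= 5;
    }
    out.push((value as u8 + 63) as char);
}

/// Encode `(lat, lng)` pairs, scaling each coordinate exactly once.
/// (Hypothetical function; the input type is an assumption.)
fn encode_coordinates_sketch(coords: &[(f64, f64)], precision: u32) -> String {
    let factor = 10_f64.powi(precision as i32);
    // One buffer for the whole polyline, no per-value String allocation.
    let mut out = String::with_capacity(coords.len() * 8);
    // `prev_*` hold already-scaled integers, so they are never rescaled.
    let (mut prev_lat, mut prev_lng) = (0_i64, 0_i64);
    for &(lat, lng) in coords {
        let cur_lat = (lat * factor).round() as i64;
        let cur_lng = (lng * factor).round() as i64;
        encode_delta(cur_lat - prev_lat, &mut out);
        encode_delta(cur_lng - prev_lng, &mut out);
        prev_lat = cur_lat;
        prev_lng = cur_lng;
    }
    out
}
```

Calling `encode_coordinates_sketch(&[(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)], 5)` reproduces the well-known example string from the polyline format documentation.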
I have changed the code to use the LUT implementation wherever it was faster on my local machine (M2 Pro), and I chose to leave the respective alternative in the code. I encourage you to test on your system as well; your results may differ.
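For readers unfamiliar with the naive versus LUT distinction, the following sketch shows one possible shape for it on the decoder side. The thread does not show the table layout, so everything here is an assumption: `INVALID`, `build_decode_lut`, `DECODE_LUT`, `chunk_naive`, and `chunk_lut` are hypothetical names, and the PR's real tables may be organised differently.

```rust
/// Marker for bytes outside the polyline alphabet (hypothetical constant).
const INVALID: u8 = 0xff;

/// Build a 256-entry table mapping each input byte to its chunk value.
/// Valid polyline bytes are 63..=126; everything else maps to INVALID.
const fn build_decode_lut() -> [u8; 256] {
    let mut table = [INVALID; 256];
    let mut byte = 63_usize;
    while byte <= 126 {
        table[byte] = (byte - 63) as u8;
        byte += 1;
    }
    table
}

/// Table computed at compile time, so lookups cost one indexed load.
static DECODE_LUT: [u8; 256] = build_decode_lut();

/// Naive variant: range check plus subtraction on every byte.
fn chunk_naive(byte: u8) -> Option<u8> {
    if (63..=126).contains(&byte) {
        Some(byte - 63)
    } else {
        None
    }
}

/// LUT variant: validity and value come from the same table entry.
fn chunk_lut(byte: u8) -> Option<u8> {
    match DECODE_LUT[byte as usize] {
        INVALID => None,
        chunk => Some(chunk),
    }
}
```

Both functions return the same result for every byte, which is what makes it practical to keep both in the code and pick whichever benchmarks faster on a given machine.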
The main improvement regarding #39 and #37 comes from explicitly computing and comparing against the maximum shift that would still provide a valid longitude/latitude coordinate.
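The maximum-shift check can be sketched roughly as follows. This is an illustration under stated assumptions rather than the PR's implementation: `max_shift`, `decode_delta`, and `decode_polyline_sketch` are hypothetical names, the bound assumes a delta of at most 360 degrees, and errors are plain `String`s for brevity.

```rust
/// Largest bit offset at which a chunk may still start (hypothetical helper).
/// Assumes the largest delta is a full 360 degree swing, doubled by zigzag.
fn max_shift(precision: u32) -> u32 {
    let max_value = 2 * 360 * 10_u64.pow(precision);
    let bits = 64 - max_value.leading_zeros();
    // Round up to whole 5-bit chunks; the last chunk starts 5 bits earlier.
    ((bits + 4) / 5) * 5 - 5
}

/// Decode one delta starting at `*pos`, rejecting over-long chunk runs.
fn decode_delta(bytes: &[u8], pos: &mut usize, limit: u32) -> Result<i64, String> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = *bytes.get(*pos).ok_or("unexpected end of input")?;
        if !(63..=126).contains(&byte) {
            return Err(format!("invalid byte {} at index {}", byte, *pos));
        }
        if shift > limit {
            // More chunks than any valid coordinate delta could need.
            return Err(format!("value too long at index {}", *pos));
        }
        *pos += 1;
        let chunk = (byte - 63) as u64;
        result |= (chunk & 0x1f) << shift;
        if chunk < 0x20 {
            break;
        }
        shift += 5;
    }
    // Undo zigzag: odd values encode negatives.
    let value = if result & 1 == 1 { !(result >> 1) } else { result >> 1 };
    Ok(value as i64)
}

/// Decode a polyline into `(lat, lng)` pairs (hypothetical function).
fn decode_polyline_sketch(encoded: &str, precision: u32) -> Result<Vec<(f64, f64)>, String> {
    let factor = 10_f64.powi(precision as i32);
    let limit = max_shift(precision);
    let bytes = encoded.as_bytes();
    let (mut pos, mut lat, mut lng) = (0_usize, 0_i64, 0_i64);
    let mut coords = Vec::new();
    while pos < bytes.len() {
        lat += decode_delta(bytes, &mut pos, limit)?;
        lng += decode_delta(bytes, &mut pos, limit)?;
        coords.push((lat as f64 / factor, lng as f64 / factor));
    }
    Ok(coords)
}
```

Note that this sketch only bounds the number of chunks per value. It does not check that the accumulated latitude or longitude stays within range, which a more robust decoder might also want to do.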
Timings compared to latest `main`:

I don't quite know what to say about decoder performance. The conditionals are surely responsible for the performance hit, but it's still rather quick and not as abysmal as other solutions. I assume we'd have to sacrifice a bit of performance for robustness here?